Parallel and Distributed Mining of Association Rule on Knowledge Grid
Authors
Abstract
In a virtual organization, a Knowledge Discovery (KD) service comprises distributed data resources and computing grid nodes. A computational grid is integrated with a data grid to form a Knowledge Grid, which implements the Apriori algorithm for mining association rules on a grid network. This paper describes the development of a parallel and distributed version of the Apriori algorithm on the Globus Toolkit using the Message Passing Interface extended with Grid Services (MPICH-G2). The Knowledge Grid is created on top of the data and computational grids to support decision making in real-time applications. The case study in this paper describes the design and implementation of local and global mining of frequent itemsets. The experiments were conducted on different configurations of the grid network, and the computation time was recorded for each operation. We analyzed the results across the grid configurations, and they show that the speedup in computation time is almost superlinear.
Keywords: Association rule, Grid computing, Knowledge grid, Mobility prediction.
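The split between local and global mining of frequent itemsets can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a count-distribution scheme in which every node counts candidate itemsets on its own data partition and the counts are then summed, and it replaces the MPICH-G2 communication with a plain loop over in-memory partitions. The names `local_counts`, `generate_candidates`, `distributed_apriori`, and `min_count` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, candidates):
    """Local mining step: count candidate itemsets in one node's partition."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:
                counts[cand] += 1
    return counts


def generate_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori property)."""
    items = sorted({i for s in frequent for i in s})
    return {
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    }


def distributed_apriori(partitions, min_count):
    """Return globally frequent itemsets with their global support counts.

    Assumption: min_count is an absolute support threshold over all partitions.
    """
    candidates = {frozenset([i]) for p in partitions for t in p for i in t}
    result = {}
    k = 1
    while candidates:
        # Global mining step: sum the local counts of every partition.
        # On a real grid this sum would be a reduction across nodes.
        global_counts = Counter()
        for partition in partitions:
            global_counts.update(local_counts(partition, candidates))
        frequent = {c: n for c, n in global_counts.items() if n >= min_count}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result


if __name__ == "__main__":
    # Two partitions stand in for the data held by two grid nodes.
    partitions = [
        [["a", "b", "c"], ["a", "b"]],
        [["a", "c"], ["b", "c"], ["a", "b", "c"]],
    ]
    for itemset, count in distributed_apriori(partitions, min_count=3).items():
        print(sorted(itemset), count)
```

In this scheme only the candidate counts, not the transactions, would need to move between nodes in each pass, which is one common reason for choosing count distribution on distributed data.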
Similar resources
Association rule mining and load balancing strategy in grid systems
Parallel and distributed systems are one of the important solutions proposed to improve the performance of sequential association rule mining algorithms. However, the parallelization and distribution process is not trivial and still faces many problems of synchronization, communication, and workload balancing. Our study is limited to the workload balancing problem. In this paper, ...
A Comparative Study of Association Rule Mining Algorithms on Grid and Cloud Platform
Association rule mining is a time-consuming process because it is both data intensive and computation intensive. In order to mine large volumes of data and to enhance the scalability and performance of existing sequential association rule mining algorithms, parallel and distributed algorithms are developed. These traditional parallel and distributed algorithms are based on homogeneous ...
Application of Parallelized Apriori in Grid Computing Environment
The goal of the strategy is to improve the performance of distributed algorithms and better their responsiveness. Association rule mining algorithms have high computational complexity due to the size of their search space and the high demands of data access. The work aims at mining the data in a grid computing environment, which computes by distributing the data to its clusters and mines it in...
Parallel and Distributed Data Mining on Grids
digital data repositories, yet it is often difficult to understand what is the important and useful information in those massive data sets. To sift large data sources, computer scientists designed software techniques and tools that can analyze data to find useful patterns—these techniques contribute to the so-called knowledge discovery in databases (KDD) process. In particular, data mining is t...
Design and Analysis of a Dynamic Load Balancing Strategy for Large-Scale Distributed Association Rule Mining
Association rule mining is one of the most important data mining techniques. Algorithms of this technique search a large space, considering numerous different alternatives and scanning the data repeatedly. Parallelism seems to be the natural solution in order to be able to work with industrial-sized databases. Large-scale computing systems, such as Grid computing environments, are recently rega...